ABSTRACT

A great deal of research has been conducted in the past
20 years concerning the use of voice recognition equipment
with computers. The goal of this research has been to
improve the man-machine interface. With the breakthrough
from discrete to continuous voice recognition technology in
the 1970's, a large step toward that goal was taken.

This thesis attempts to show that continuous voice
recognition technology can be effectively applied in a
highly interactive, computer-aided wargaming environment.
Through analysis of the strictly-formatted command syntax of
the Naval Warfare Interactive Simulation System (NWISS) and
use of commercially available, innovative, continuous speech
hardware and software, a new input medium was created for
the users of that wargame. The true effectiveness of this
application of voice recognition technology must still be
tested. Plans for such testing are being made and, to that
extent, the thesis objectives are partly met.
I. INTRODUCTION


As computer science has advanced into the era of human
interactive design, one technology which has received
increasing attention and has already demonstrated many
practical results is that of speech recognition. It has the
potential to vastly change the state of man-machine
interaction by allowing humans to use their most natural
communications output mode, speech, thus freeing them
from the constraints of the keyboard. This thesis is
concerned with the application of speech recognition
technology to military wargaming, specifically a computer-aided
simulation of the naval warfare environment known as the
Naval Warfare Interactive Simulation System (NWISS).

Because computer-aided wargames entail a very high degree of
human interaction, they are excellent candidates for
application of voice recognition technology. The remainder of
this chapter will cover what speech recognition technology
is and can do, describe a specific implementation of this
technology in a product named the Verbex 3000, introduce
NWISS in more detail and close with a summary of the thesis
objectives.

A. REVIEW OF VOICE RECOGNITION TECHNOLOGY

1. General

In the discussion of automatic speech recognition,
the distinction between recognition and understanding is
sometimes unclear. W.A. Lea [Ref. 1: p. 40] has defined
voice recognition by machine "generally as the process of
transforming the continuous human acoustic signal into
discrete representations which may be assigned proper
meanings and which may be comprehended to affect responsive
behavior." For the purposes of this thesis, the process
will be understood as the conversion of human speech into
recognizable text, i.e. words and symbols. Due to
idiosyncrasies of the human voice, as exhibited in such individual
variations as sex, race, geographic origin, age, emotion and
numerous other factors which impact the human acoustic
signal, this is by no means a simple task. Machine
understanding of speech, on the other hand, is a closely related
activity which follows recognition and applies artificial
intelligence to invoke parsing rules and to make logical
inferences from the semantic content of the spoken
(recognized) message. This task is also very difficult, as humans
know from daily experience (not so with recognition, which is
generally taken for granted). While speech recognition and
understanding are not separate functions for the human (we
often use semantics to reconstruct/complete a sentence we
did not fully hear or listen to), they have tended as
computer technologies to develop in a parallel but partly
separated fashion. [Ref. 2]

It has long been recognized that speech is the
human's highest capacity, most natural output communication
channel. However it has only been during the past thirty
years that it has been possible to create machines which
begin to take advantage of this fact. In terms of human
input to computers, the keyboard has had superior speed,
error correction capability and overall versatility. How
long the keyboard will retain this superiority is open to
debate. Already commercial word recognizers have been
effectively used for such jobs as package-sorting systems
and inspection and quality control, where keyboards or
numeric keypads served before. Military uses include
cartography, computer-assisted training of air traffic
controllers and aircraft cockpit communications.
[Ref. 3: p. 29]
These uses of speech recognition technology have
benefited from several advantageous properties inherent in
speech input to machines in addition to high speed and
naturalness. Automatic speech recognition is unique in its
ability to free the user's mind and eyes for such tasks as
viewing graphics screens or other decision aids in a command
post, overseeing an operations center, or just reading from
a data source without having to remove the eyes to find and
ensure the correct key is being struck. While skilled
typists can read from a data source and input data at a
rapid rate, such proficiency is not achieved without a good
deal of training. By comparison training in the use of
voice recognition equipment can be minimal. [Ref. 4]

A further advantage of automatic speech recognition
is mobility. With a lightweight, wireless, microphone
headset, a person is free to roam about and attend to other
duties, such as an air traffic controller monitoring radar
screens and speaking simultaneously to pilots and machine,
either for transcript purposes or to control navigation and
landing aids. Finally machine access can be controlled
through spoken codeword authentication using voice
recognition in combination with a speaker verification system.
[Ref. 4]

2. Continuous Voice Recognition

Historically, almost all commercial applications of
voice recognition technology have fallen in the category
known as isolated (discrete) word recognizers. Typically
this class of speech recognizers has demonstrated the
ability to recognize limited vocabularies (up to 300 words)
where the speaker is required to pause perceptibly between
each word or utterance (string of words constrained to a
specific timeframe such as 1.5 or 2.0 seconds). The pauses
provide boundaries for the machine processing of the
message and allow the machine to "catch up" to the speaker.
Discrete speech recognition detracts from the naturalness of
human speech and imposes an artificial constraint on the
speaker which requires training to adapt to such a speaking
mode.

Sparked by the five year, $15 million research
effort of the Department of Defense's Advanced Research
Projects Agency in the mid-1970's known as Speech
Understanding Research (ARPA SUR), the recognition of continuous
speech was proven possible in the laboratory. As other
advances in microcomputer and memory technologies came
about, the first commercial continuous voice recognition
products have come on the market in the past three years.
While speaker-dependence and limited vocabulary (300 or so
words) are still the rule, clearly naturalness is enhanced
as well as input rate when continuous speech is used. "One
can continuously speak at a rate of about 150 to 300 words
or more per minute, but when words must be individually
separated by pauses, the rate drops to less than 125
(usually to around 50 to 50) words per minute."
[Ref. 1: p. 66]

In general, continuous voice recognizers rely on the
definition of grammars to limit the number of words from
which the machine must choose at any instant based on
previously recognized words. A grammar is a representation of
the allowable word sequences in a state diagram composed of
nodes representing a word or group of words and the possible
transitions between the nodes. While it has been shown that
finite state grammars cannot properly characterize major
subsets of English sentences, unless sentence complexity is
severely limited, they are quite appropriate for
applications involving strictly-formatted sequences of words.
[Ref. 1: p. 52]
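
The state-diagram idea described above can be illustrated with a minimal sketch in Python. It is not drawn from any recognizer discussed in this thesis: the node names, the tiny vocabulary and the closing word "GO" are all assumptions chosen for illustration. The point is only that, at each node, the set of candidate words is far smaller than the total vocabulary.

```python
# Hypothetical finite-state grammar. Each node maps the words that may be
# spoken at that point to the node reached once the word is recognized.
# The vocabulary and the closing word "GO" are illustrative assumptions.
DIGITS = {str(d): "digits" for d in range(10)}
GRAMMAR = {
    "start": {"FOR": "addressee"},
    "addressee": {"KNOX": "command", "OMAHA": "command"},
    "command": {"COURSE": "first_digit", "SPEED": "first_digit"},
    "first_digit": dict(DIGITS),          # at least one digit is required
    "digits": {**DIGITS, "GO": "end"},    # more digits, or close the command
    "end": {},
}


def allowed_words(node):
    """Return the only words the recognizer must consider at this node."""
    return sorted(GRAMMAR[node])


def accepts(words, node="start"):
    """Follow the transitions word by word; reject any disallowed word."""
    for word in words:
        transitions = GRAMMAR[node]
        if word not in transitions:
            return False
        node = transitions[word]
    return node == "end"
```

Calling `allowed_words("command")` lists just two candidates even though the whole sketch contains fifteen distinct words, which is the sense in which a grammar limits the choice facing the machine at any instant.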


B. VERBEX 3000 SPEECH APPLICATION DEVELOPMENT SYSTEM (SPADS)

One product in particular which resulted from continuous
voice work of the 1970's and which is the basis for this
thesis is the Verbex 3000 voice terminal, marketed by a
subsidiary of Exxon Corporation. The Verbex 3000 [Ref. 5:
p. 8] "is a continuous speech, voice data entry terminal.
It can operate as an input/output peripheral that adds voice
entry capability to other computer systems or, in some
applications, as a stand-alone data handling system. It is
designed for use in industrial and commercial environments
with either high or low noise levels, and allows operators
to input data and commands in a naturally spoken stream of
numbers, words, or phrases, without pausing." With its
maximum number of four speech processing boards, the Verbex
3000 can recognize up to 360 different words spread over as
many as 20 grammars.1 A finite limit to grammar size, based
on total number of words and complexity of the node
transition network, is necessary to allow the device to remain
"real time" in terms of computation speed and memory (stored
voice patterns) requirements. Thus the total application
may involve up to 360 words, but at any instant the
recognizer is dealing with only a subset (grammar).

The Speech Application Development System (SPADS) is a
hardware/software adjunct to the Verbex 3000 voice terminal
which allows the user to program the voice terminal to run a
particular application. In other words, the design and
definition of grammars, the control of transitions between
grammars, the processing of text output, and the design of
the terminal's visual and aural interface (feedback) with
the operator are accomplished through SPADS. Verbex has
designed SPADS so that Verbex 3000 users can write their own
applications and update them as required vice purchasing
customer engineering services from Verbex. Currently in
beta test status (as delivered to NPS), the SPADS is
intended to be user friendly such that the building of
applications and grammars is done through menu-driven
editors. A procedure must be written in the Pascal
programming language to control the application. Verbex supplies
about 20 predefined functions to ease this process.

1 This information was obtained at a SPADS training
course attended by the author and conducted by Mr. Thomas
DiGennaro, Verbex Senior Software Engineer, 7-9 November
1983. Future references to this source will simply be
denoted "SPADS Training Course".

C. NAVAL WARFARE INTERACTIVE SIMULATION SYSTEM (NWISS)

As noted earlier, the candidate system for application
of continuous voice recognition technology was the NWISS
developed at the Battle Group Training Computer Support
Facility within the Naval Ocean Systems Center (NOSC) at
San Diego. Per the NWISS user's manual, NWISS "is a real-time,
man-interactive, discrete event, time step computer-aided
simulation of the naval warfare environment . . . for the
purpose of supporting the training of senior naval officers
in force-level tactical decision making and management in
command and control."

In addition to NOSC, the NWISS is resident at the NPS
Wargaming Analysis and Research Laboratory where it is used
primarily to introduce wargaming to students and to expose
them to tactical, force-level decision making problems in
command and control. The NWISS supports a two-sided (Blue
vs. Orange) scenario in which opposing sides can define,
structure, and dynamically control forces with the support
of an umpire-like function called Control. Normally, at
NPS, the force building and structuring phases are
accomplished by the instructor and the students begin the wargame
phase with a predefined scenario and force structure. (The
most often used (at NPS) set of scenarios is based on
operations in the Sea of Japan. Since the author's NWISS
experience was as a Blue player, this thesis uses the Blue force
levels and names from the Sea of Japan scenario set.)

During the wargame phase the two sides must position and
equip their forces and sensors to be ready for whatever the
scenario entails, staying within the rules of engagement,
and engage the opposing forces in combat when appropriate.
The NWISS can support various "views" of the action,
representing the current tactical situation known to a user
through the various sensors organic to his controlled force
elements. Thus the Blue side may for example consist of
several views representing different warfare commanders.

Typically a player has at his disposal an alphanumeric
display capable of showing various information status
boards, a color graphics display showing force positions
(with Naval Tactical Data System symbology) and sensor
information superimposed over a map of the area of
operation, and an alphanumeric terminal for entering the player
commands. Via keyboard, the player may enter
strictly-formatted commands to change the graphics display
characteristics, to equip and move forces, to control
sensors, to engage enemy forces and, in general, to command
and control the battle from his view. The alphanumeric
terminal can also be used to send and receive messages
between players (views) of the same side and/or Control.

The purpose of this brief description of NWISS is to
provide a sense of the system's overall capability and, more
importantly, to emphasize that, as currently configured at
NPS (but not NOSC), all input to the NWISS from its players
is via keyboard. A fair degree of user friendliness has
been incorporated: prompts and help for entering the next
field of a command are readily provided, only enough
characters need be typed to guarantee uniqueness, and
errors are easily corrected. Even so, command entry
can still be a laborious, error prone, time consuming task
for many users. Precise entry of force names, weapon
identifiers, latitude-longitude, bearings, ranges, altitudes, et
cetera is required and errors are not always easily forgiven
by NWISS.

D. PURPOSE

1. A Past Study of Voice Recognition Applied to Wargaming

The predecessor to NWISS at NPS was the Warfare
Environmental Simulator (WES), also developed at NOSC, with
many of the same capabilities as NWISS but to a lesser
degree (particularly in system response time). In 1981, W.
J. McSorley published his master's thesis at NPS on the
subject of using voice recognition technology to run WES.
Using a discrete voice recognizer (Threshold 600), he
compiled a set of typical WES commands and conducted an
experiment with 12 subjects of varied typing abilities and
voice (microphone) experience to determine which input
medium was superior. Based on lengthy statistical analysis
of speed and error results, McSorley [Ref. 6: p. 65]
concluded that "the subjects were able to input WES commands
faster and with fewer total errors using the manual typing
mode than with voice mode." Since McSorley's subjects were
using discrete voice and had an average typing ability
better than 35 words per minute, the results are not too
surprising.

2. Thesis Objectives

One reason for summarizing McSorley's work is to
show that interest in applying voice recognition technology
to computer-aided wargames is not new. Such wargames are
highly interactive and hence invite attempts to improve the
interaction (and the participants are usually willing
subjects in new or experimental undertakings). Given the
inherent advantages to voice input stated earlier, it seems
only natural to want to apply speech recognition technology
to a wargame such as NWISS.

The more cogent reason for reporting McSorley's
results is to establish the grounds for this thesis: to
make use of the progression from discrete to continuous
voice recognition technology and build a voice input medium
to NWISS which has the potential to compete effectively with
the keyboard both in speed and error results. With such an
input medium, NWISS players will be able to spend their time
more profitably in monitoring the graphics display and
status boards and in commanding and controlling their forces
with more natural voice commands as opposed to being tied to
a keyboard. However this application of continuous voice
recognition technology is not intended to, nor can it,
completely replace the player's alphanumeric keyboard.
Rather it is intended to substantially improve the
man-machine interface for NWISS and allow the player to perform
all but a very small part of his input with voice vice the
keyboard.

Given the commercial state of the art (as
represented by the Verbex 3000), the challenge is to thoroughly
scrutinize the subject application (NWISS) and so design
grammars and a grammar transition network in software such
that grammar boundaries (which tend to be "discrete") occur
in places where natural pauses occur or where the user can
be easily induced to pause with minimal disruption to speech
patterns.

Thus the purpose of this thesis is to show that
continuous voice recognition technology can be effectively
applied to a computer-aided wargame through demonstration of
the software design process (Chapter 2) and the user's
operating instructions (Chapter 3). To be precise, the NWISS
Blue player commands, force names, weapon identifiers, et
cetera for the Sea of Japan scenario (totalling about 150
words or symbols) have been placed into 10 different (but
overlapping) grammars in an application which uses about 800
lines of Pascal to control the network flow between grammars
and restructure textual output to NWISS requirements. The
application is designed to be user friendly and require a
minimum of player involvement with the mechanics of the
process.
II. SOFTWARE DESIGN


As noted in Chapter 1, the application of continuous
voice recognition technology requires careful study of the
man-machine interactive process being substituted for or
supported. In the case of NWISS, such questions as the
following must be answered:

•  Where  will  natural  input  pauses  occur? 

•  What  feedback  or  prompting  will  the  user  require? 

•  What  means  must  be  provided  for  error  control? 

•  How  must  the  data  be  structured  for  the  host  computer 
and  what  special  data  characters,  if  any,  are  needed? 

•  How  will  the  overall  process  be  controlled,
particularly  in  the  time  domain?

The  answers  to  these  and  other  related  questions  lead  to  the 
initial  design  of  the  grammars  and  the  application  control 
program.  This  chapter  will  define  the  NWISS  requirements  in 
terms  of  input  command  syntax  and  data  structures,  describe 
the  grammars  which  resulted  from  the  design  process,  and 
provide  some  insights  to  the  structure  of  the  application 
program  code. 

A.  NWISS  REQUIREMENTS 
1. Command Syntax

There are approximately 37 commands which the NWISS
player can issue to forces under his control for maneuvering
and launching platforms, controlling sensors, and engagement
[Ref. 7]. In this paper these are referred to as "FOR"
commands, of which about 23 are used with any frequency at
NPS and have been included in the continuous voice
recognition application (see Table I). There are approximately
another 20 commands an NWISS player can use to control the
graphics display characteristics, obtain bearings and
accomplish other actions [Ref. 7]. Again these 20 commands are
not all used at NPS and so the 10 commands used most often
have been included in the voice application and are
addressed later. The primary reasons for excluding unused
commands are grammar and application code efficiency and
reduced user training time: these design factors and others
will be addressed throughout this chapter. What follows now
is a description of the general syntax and some examples of
these commands.


TABLE I

Permitted NWISS "FOR" Commands

ALTITUDE    DECS      MISSION      SPEED
BARRIER     DEPTH     ORDERS       STATION
BINGO       EMCON     PERISCOPE    SURFACE
BLIP        FIRE      PROCEED      TAKE
COURSE      LAUNCH    RBOC         WEAPONS
COVER       LOAD      REFUEL

a. "FOR" Commands

The primary NWISS command syntax is quite simple
and consists of the following standard format:

FOR <addressee> <command> [ TIME <start-minute> ] <CR>

where the following conventions apply:

1. Capitalized words are command keywords which may be
entered in abbreviated form (enough letters to ensure
uniqueness).

2. Lowercase words inside parentheses are prompts from
NWISS received when the space or escape character is
struck after the preceding field.

3. Lowercase words inside arrows are command arguments
which must be specified precisely.

4. Keywords and arguments inside brackets are optional.

5. "FOR <addressee>" is not required for subsequent
commands directed to the same addressee.
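
Convention 1 (abbreviation to a unique prefix) can be sketched in a few lines of Python. This is an illustration only, not NWISS code: the function name `expand_keyword` is hypothetical, and the keyword list is a small subset of the "FOR" commands chosen for the example.

```python
# Illustrative subset of command keywords; the real NWISS list is longer.
KEYWORDS = [
    "ALTITUDE", "COURSE", "FIRE", "LAUNCH", "LOAD",
    "MISSION", "ORDERS", "PROCEED", "SPEED", "STATION", "SURFACE",
]


def expand_keyword(abbrev, keywords=KEYWORDS):
    """Expand an abbreviation to the single keyword it identifies.

    Raises ValueError when the abbreviation matches no keyword or
    matches more than one (i.e. too few letters were entered).
    """
    text = abbrev.upper()
    if text in keywords:            # an exact match always wins
        return text
    matches = [k for k in keywords if k.startswith(text)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown keyword: {abbrev!r}")
    raise ValueError(f"ambiguous keyword {abbrev!r}: {', '.join(matches)}")
```

With this list, "SP" is enough for SPEED, whereas "S" alone is rejected because SPEED, STATION and SURFACE all begin with it.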

Some specific command formats and examples are shown in
Figure 2.1 where <CR>, carriage-return, is assumed and
"TIME" is not used. As convention 4 implies, the keyword
"TIME" and its following argument are optional: most player
commands are entered without specifying a desired game time
so that execution by NWISS occurs in real time.


FOR <addressee> ALTITUDE <feet>
"FOR VA024 ALTITUDE 4000"

FOR <addressee> COVER <track #>
"FOR MP604 COVER FS002"

FOR <addressee> PROCEED POSITION <latitude> <longitude>
"FOR KNOX PROCEED POSITION 36-00N 134-55E"

FOR <addressee> FIRE <number> <name> TORPEDO (at) <track #>
"FOR OMAHA FIRE 2 MK48 TORPEDO (at) BS006"


Figure 2.1 Examples of "FOR" Command Syntax.

The "addressee" and "force-name" fields have the
same domain, i.e. all legitimate force-names are specified
in the NWISS force-building and scenario definition phases.
For the Sea of Japan scenarios, the addressable Blue forces
number 9 ships, 1 shorebase and potentially 101 aircraft
(aircraft do not have callsigns assigned and hence are not
addressable until launched). Aircraft callsigns consist of
2 alphabetic characters followed by 3 digits whereas
ship-names and shorebases may be abbreviated to the first 5
characters, e.g. RATHBourne. In addition, task force
designators may be used to address collective segments of the
Blue forces or individual units (not aircraft). Thus "FOR
1.1" catches the entire Blue task group including all
aircraft while "FOR 1.1.0.0" catches the Kittyhawk but not
her aircraft. Other than "1.1", these designators are
seldom if ever used and are in a sense redundant. For that
reason plus a technical problem with defining periods as
part of object names in the Verbex 3000, only "1.1" is
permitted in this application. See Table II for all
allowable force-names.
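
The addressee rules just described (callsigns of 2 letters plus 3 digits, ship and shorebase names abbreviated to 5 characters, and the single permitted designator "1.1") can be checked mechanically. The following Python sketch is an assumption-laden illustration: the function name `normalize_addressee` is hypothetical, and the name list is only a partial sample of the scenario's forces.

```python
import re

# Partial, illustrative list of ship and shorebase names.
SHIPS_AND_BASES = ["KITTYHAWK", "KNOX", "OMAHA", "SPRUANCE", "WILSON", "MISAWA"]

# Aircraft callsign: 2 alphabetic characters followed by 3 digits.
CALLSIGN = re.compile(r"[A-Z]{2}\d{3}")


def normalize_addressee(name):
    """Return the addressee in the short form described in the text.

    Callsigns and "1.1" pass through unchanged; ship and shorebase
    names are reduced to their first 5 characters. Anything else
    raises ValueError.
    """
    text = name.upper()
    if text == "1.1" or CALLSIGN.fullmatch(text):
        return text
    for full in SHIPS_AND_BASES:
        # Accept the full name or any prefix of at least 5 characters
        # (or the whole name when it is shorter than 5 characters).
        if len(text) >= min(5, len(full)) and full.startswith(text):
            return full[:5]
    raise ValueError(f"not an addressable force: {name!r}")
```

Designators such as "1.1.0.0" are rejected here on purpose, mirroring the decision to permit only "1.1" in this application.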

TABLE II

Sea of Japan Scenario Blue Force-names

Ships         Aircraft       Shorebase
KITTYhawk     MP 600-608     MISAwa
KNOX          SH 100-103
LOSANgeles    VA 000-033     Task Group
MCCORmick     VS 000-003     1.1
OMAHA         VE 100
RATHBourne    VF 000-019
SPRUAnce      VH 000-005
WICHIta       VK 000-003
WILSOn        VS 000-009
              VT 000-003
              VW 000-003


b. "LAUNCH" Command

The longest and most difficult "FOR" command to
learn how to enter properly to NWISS is that for launching
aircraft. Ability to correctly launch aircraft is of course
indispensable to game play. To avoid too much complication
and be in conformity with how the command is most often
used, its simplest syntax will be discussed and is shown
below:

FOR <addressee> LAUNCH <number> <aircraft-type> (event name)
<name> (course) <degrees> (speed) <knots> (altitude) <feet>

Upon accepting this "first-level" command, NWISS responds
with the prompt "FLT PLAN:" on a new line. In theory the
user may now specify any of about 25 commands applicable to
aircraft. However the normal NPS practice is to provide a
"MISSION" for the aircraft, "LOAD" the aircraft with
expendables, possibly specify a "PROCEED POSITION", and signify
the completion of the launch command with "STOP". Except
for "STOP", any of the other commands is followed by the
prompt "FLT PLAN:" on a new line. Figure 2.2 shows a
complete launch command.
"FOR KITTY LAUNCH 6 F14A (event name) F14A1 (course)
210 (speed) 650 (altitude) 2000"

FLT PLAN: "MISSION CAP"

FLT PLAN: "PROCEED POSITION 36-30N 137-45E"

FLT PLAN: "LOAD (equipment) 2 PHENX 2 SPAR 2 SWDR"

FLT PLAN: "STOP"


Figure 2.2 Example of Complete Launch Command.
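
The launch dialogue of Figure 2.2 amounts to one first-level line followed by a short flight plan that ends with "STOP". A minimal Python sketch of assembling those lines follows. The function `launch_lines` and its parameters are hypothetical names, not part of NWISS or of the thesis software, and the parenthesized prompts are omitted because NWISS supplies them rather than the player.

```python
def launch_lines(addressee, number, aircraft, event, course, speed,
                 altitude, mission, load, position=None):
    """Build the player's input lines for a simple aircraft launch.

    `load` is a list of (quantity, item) pairs; `position` is an
    optional (latitude, longitude) pair of strings such as
    ("36-30N", "137-45E"). Returns the first-level command followed
    by the flight plan entries, ending with "STOP".
    """
    first = (f"FOR {addressee} LAUNCH {number} {aircraft} {event} "
             f"{course} {speed} {altitude}")
    plan = [f"MISSION {mission}"]
    if position is not None:
        plan.append(f"PROCEED POSITION {position[0]} {position[1]}")
    plan.append("LOAD " + " ".join(f"{qty} {item}" for qty, item in load))
    plan.append("STOP")   # signifies completion of the launch command
    return [first] + plan


if __name__ == "__main__":
    for line in launch_lines("KITTY", 6, "F14A", "F14A1", 210, 650, 2000,
                             "CAP", [(2, "PHENX"), (2, "SPAR"), (2, "SWDR")],
                             ("36-30N", "137-45E")):
        print(line)
```

Keeping the flight plan as a list makes the "STOP" terminator explicit, so an incomplete launch cannot be mistaken for a finished one.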

c.  Graphics  Display  and  Other  Commands 

In addition to the "FOR" commands, there is a
large repertoire of commands for controlling the
characteristics of the NWISS graphics output. In general, the
geographical area of operation can be centered about any
force-name, track, or position specified and its radius can
be made as small or as large as desired. An xmark, circle
or grid (set of concentric circles plus 12 lines of bearing
spread 30 degrees apart) can be placed over any force, track
or position. NWISS generated lines of bearing (LOB) for
passive sonar or ESM (electronic sensor) may be erased or
turned back on. Finally there are other non-graphics
commands for executing a preplanned launch (the Sea of Japan
scenario provides five "canned" Blue launch plans which
allow the player to get many aircraft up at once),
obtaining bearing and range information, or overriding the
NWISS generated NTDS assignments for friendly, neutral or
enemy platforms. Figure 2.3 shows the syntax of some of these
commands, each together with an example.




PLACE (a) CIRCLE (around) FORCE <force-name>
          GRID            TRACK <track #>
                          POSITION <lat> <long>
      (radius) <nautical-miles>
"PLACE (a) CIRCLE (around) FORCE MP604 (radius) 60"


CENTER (plot at) FORCE <force-name>
                 TRACK <track #>
                 POSITION <latitude> <longitude>
"CENTER (plot at) FORCE KITTY"


<CTRL-F> <plan-name>
"<CTRL-F> F14STRCAP.PRE"


BEARING (and range from) FORCE <force-name>
                         TRACK <track #>
                         POSITION <lat> <long>
                    (to) FORCE <force-name>
                         TRACK <track #>
                         POSITION <lat> <long>
"BEARING (and range from) FORCE KNOX (to) TRACK BS004"


DESIGNATE (as) FRIENDLY <track #>
               NEUTRAL
               ENEMY
"DESIGNATE (as) ENEMY BU007"


Figure 2.3 Graphics Display and Other Player Commands.

2. NWISS Data Requirements

As shown in Figure 2.3, execution of preplanned launches is
accomplished with a "control-f" followed by the plan name.
Of greater importance is the "control-k" character, with
which the NWISS player may cancel any command prior to its
complete entry and acceptance by NWISS. There are word and
character erase functions in NWISS as well, but they are not
pertinent to a continuous voice application, which outputs
buffered strings of data vice character-by-character keyboard
output. This means that the output from the voice recognizer
must be at least syntactically correct and thus in conformity
with the syntax examples shown above. In general, output
strings must have spaces separating command keywords and
arguments, and spaces must be kept out of digit strings which
are meant to be contiguous, since NWISS can interpret an
intervening space as completion of the field (e.g., launching
an F14 at 5 vice 5000 feet). Another requirement is to
signify completion of command entry with a <CR> (carriage
return).
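
The output requirements above can be illustrated with a short sketch. It is written in Python for brevity rather than in the Pascal of Appendix A; the function name format_command and the token-list input are assumptions made for illustration and are not part of the thesis software.

```python
def format_command(tokens):
    """Join recognized tokens into one NWISS command string.

    Adjacent single-digit tokens are merged so that no space falls inside
    a number; all other tokens are separated by exactly one space, and
    the command is terminated with a carriage return.
    """
    fields = []
    prev_was_digit = False
    for token in tokens:
        is_digit = len(token) == 1 and token.isdigit()
        if is_digit and prev_was_digit:
            fields[-1] += token  # extend the current digit string
        else:
            fields.append(token)
        prev_was_digit = is_digit
    return " ".join(fields) + "\r"
```

With this sketch, the tokens ALTITUDE 5 0 0 0 would be sent as one keyword followed by one contiguous number, so the host cannot mistake the first digit for the whole field.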

B. GRAMMAR DESIGN

1. Overview

As noted earlier, ten grammars have been defined for the
NWISS continuous speech application. While the
strictly-formatted structure of the NWISS commands combined
with the Verbex upper limit on grammar size helps to
determine the overall grammar design, there are numerous
factors to consider in building the software (both grammars
and Pascal procedure) for any application. According to
Verbex [Ref. 8: p. 60], the grammar design goals are:

•  to improve recognition accuracy

•  to allow for feedback

•  to allow continuity of speech

•  to allow natural pauses

•  to reduce response time

•  to allow error correction

•  to reduce storage requirements

As with most sets of goals, there are some incompatibilities
among them and tradeoff decisions must be made. This
section will briefly describe the Verbex tools used to
define grammars, describe the major grammars built for NWISS
in detail, and, wherever appropriate, discuss grammar design
goals and tradeoffs vis-a-vis the NWISS application.

2.  Verbex  Grammar  Design  Tools 

a. Verbex Standard Notation (VSN)

In a takeoff on Backus-Naur Form (BNF), Verbex has created a
very logical and understandable means for defining grammars,
which must first be described in order to define the NWISS
grammars. The basic element of VSN is the object, which may
be either simple or compound. A simple object is a word
that the user actually speaks. A compound object is a
category, or group of objects, and is so denoted by placing
a period in front of it. Note that a compound object can
represent a group of compound objects as well as simple
objects. Within a compound object definition, alternative
objects are arranged vertically and consecutive objects are
arranged horizontally. Thus to define a compound object
which is used in every NWISS grammar, we write

.digit ::= 0
           1
           2
           3
           4
           5
           6
           7
           8
           9

to represent the fact that any one of the ten digits may be
spoken wherever .digit appears in a higher level compound
object or grammar definition. Thus

.course ::= .digit .digit .digit

could  suffice  as  the  VSN  definition  of  a  compound  object 
used  frequently  in  NWISS.  It  may  be  noted  that  sometimes  a 
course  may  be  specified  with  only  1  or  2  digits.  VSN 
defines  optional  objects  with  brackets.  Hence  a  better 
definition of course might be

.course  ::=  .digit  [.digit]  [.digit] 

to allow such variation. However, this definition of
"course" implies a time constraint: the Verbex 3000 will
consider the above string to be complete after the first
digit plus a half-second of silence. The user needs to know
this so that, if a course is to be specified with 3 digits,
it is spoken as a continuous string without pauses. The
tradeoff here is whether users must enter "course" as 3
digits always (in which case the Verbex 3000 waits until its
timeout threshold, approximately one minute, to receive all
3 digits) or whether they can enter "course" as 1, 2, or 3
digits with no intermittent pauses.

Finally, VSN allows an unspecified amount of repetition of
the same object through use of the "+". In NWISS, aircraft
altitudes can vary by several digits in length. Thus the
following definition

.altitude ::= .digit+

accomplishes the same task as several consecutive,
bracketed, identical objects. Again, however, there is the
same time constraint imposed on the user as above with
brackets: no intermittent pauses. On the other hand, a user
could speak digits continuously up to the Verbex 3000 buffer
limit of 258 digits² with the above definition.
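
As a rough analogy only, the bracket and repetition notations behave like the optional and one-or-more operators of a regular expression. The following Python sketch (Python is used for brevity; the pattern and function names are illustrative assumptions, and timing behavior such as the half-second silence is not modeled) shows which digit strings each definition admits.

```python
import re

DIGIT = r"[0-9]"

# .course with two bracketed (optional) digits: one to three digits
COURSE = re.compile(rf"{DIGIT}{DIGIT}?{DIGIT}?")

# .altitude with unspecified repetition of .digit: one or more digits
ALTITUDE = re.compile(rf"{DIGIT}+")


def is_course(text):
    """True if text is a complete path through the bracketed definition."""
    return COURSE.fullmatch(text) is not None


def is_altitude(text):
    """True if text is a complete path through the repeated definition."""
    return ALTITUDE.fullmatch(text) is not None
```

Under these assumptions "9", "90" and "090" are all acceptable courses while "0900" is not, and any non-empty digit string is an acceptable altitude.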


² SPADS Training Course

According to Verbex [Ref. 8: p. 64], "a grammar is complete
when every compound object included in its definition has
been completely defined in terms of simple objects. There
is no limit to the number of levels this definition can
take, nor on the scope of complexity of any level." To see
how this definition applies, the VSN definition of the NWISS
grammar "Position" is shown in Figure 2.4. This grammar is
confusing at first glance since it seems to provide for both
latitude and longitude but without specifying which. That
is precisely the job of the application code: it works in
cohesion with the grammar design and calls Position twice
for recognition, the first time with the prompt "LATITUDE"
and the second time with the prompt "LONGITUDE". Here
grammar efficiency is achieved because the application code
takes advantage of the natural pause which occurs between
latitude and longitude expressions.


Position ::= [1] .digit .digit [.minutes] .direction
             CONTROL_K

.minutes ::= - .6digit .digit

.direction ::= N
               S
               E
               W

.6digit ::= 0
            1
            2
            3
            4
            5

.digit ::= 0
           1
           2
           3
           4
           5
           6
           7
           8
           9

Figure 2.4 NWISS Position Grammar.

With that understanding, we may note that Position has an
optional compound object, .minutes, which the user is
required to begin with the sentinel "dash". Since NWISS
requires the dash from the keyboard as well, this does not
seem burdensome. Less complexity and greater accuracy are
achieved with the object .6digit defined as part of
.minutes, since it restricts the first digit of the minutes
to 0 through 5.

Note also that CONTROL_K is an alternative in this grammar
as well as all other NWISS grammars: the user may cancel the
NWISS command at any point. More rationale for the
CONTROL_K will appear in the discussion of the grammar
Nwisgram1. Further, the Verbex will output N, S, E, or W in
accordance with the voice signals which the user trains for
those symbols. In other words, the user may train and speak
these as "North", "South", "East" and "West". Finally, note
that the grammar will not prevent incorrect expressions: one
could easily say "123N" for latitude or "95S" for longitude.
The primary purpose of grammar definition is to define what
can be said legally. Preventing illegal expressions is a
"side" benefit which, if pursued too far, can cost too much
in complexity, a subject of the next section.
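
The distinction between what the grammar permits and what is actually legal can be sketched as follows. This is an illustrative Python fragment, not the thesis code: the regular expression only approximates the Position grammar of Figure 2.4, and the range checks in is_legal are assumptions about what a separate validation step might test.

```python
import re

# Approximation of Position:
#   [1] .digit .digit [- .6digit .digit] .direction
POSITION = re.compile(r"(1?[0-9][0-9])(?:-([0-5][0-9]))?([NSEW])")


def in_grammar(text):
    """True if text is a complete path through the grammar."""
    return POSITION.fullmatch(text) is not None


def is_legal(text, kind):
    """Semantic check that the grammar itself does not perform.

    kind is "LATITUDE" or "LONGITUDE", matching the two prompts used
    by the application code.
    """
    match = POSITION.fullmatch(text)
    if match is None:
        return False
    degrees = int(match.group(1))
    minutes = int(match.group(2) or 0)
    direction = match.group(3)
    if kind == "LATITUDE":
        limit, allowed = 90, "NS"
    else:
        limit, allowed = 180, "EW"
    return direction in allowed and degrees + minutes / 60 <= limit
```

Both "123N" and "95S" pass in_grammar, yet neither passes is_legal for the prompt under which it was spoken, which is exactly the gap the text describes.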

b. Verbex Grammar Editor

The Verbex 3000 SPADS has a menu-driven facility, called
GRID, for creating grammars which is basically user
friendly. With GRID, the designer inputs potential grammars
for an application in VSN form, pushes the SPADS
"application generate" function key, and sits back. The
result, if the grammars are not too complex, is a complete
application less the Pascal control code. In other words,
SPADS automatically interprets the GRID-built VSN file and
builds recognition instructions for the Verbex 3000 telling
it when and where to look for acoustic input, including long
and short silences, for the grammars in that application.
It also creates all the files necessary for user voice
training and testing of the grammars.

Regarding complexity, SPADS generates a report for each
grammar which states the number of individual words in that
grammar, the percentage of machine vocabulary capacity that
total represents, and, most importantly, a determination of
the grammar's complexity expressed as a percentage of
machine capacity. Complexity is based on both number of
words and number of node transitions in the grammar. As
this complexity percentage nears 100, the Verbex 3000
ability to remain "real time" is reduced [Ref. 5:
pp. 72-75]. However, SPADS generally (in the author's
experience) will not generate a grammar with complexity
higher than 90% for the maximum capacity model 3000 (four
speech processing boards). The formula used by Verbex to
compute complexity is too cumbersome to describe here and is
fully explained in [Ref. 8]. However, there are three
important factors in the computation:

1. Total number of distinct simple objects or words in
the grammar;

2. Total number of words that may occur as the first
word of a legal path through the grammar;

3. The average length of all possible paths through the
grammar.

To provide some yardstick measure of complexity for
comparison with other grammars, the report for the grammar
Position is:

Total vocabulary is 16 words
Vocabulary is 5% of capacity
Complexity is 41% of capacity.

The average path length is the major factor in the
complexity of the Position grammar. Most of the NWISS
grammars fall in the 40 to 60% range for complexity.

3. NWISS Grammars

a. Nwisgram1

The first grammar from which the Verbex 3000 attempts to
recognize NWISS commands is Nwisgram1. This grammar is
called in for recognition by the Pascal application code as
soon as the previous command has been output to NWISS. Its
purpose is to allow the user to begin any legal NWISS
command and then, based on the recognition result, allow the
application code to call in the next appropriate grammar.
The ways in which one can begin a command are numerous:

FOR  <addressee> 

Graphics  command 

Other  non-graphics  command 

"FOR"  command  without  "FOR  <addressee>" 

(when  directed  to  same  addressee  as  before) 

Based on trial and error experience, these requirements are
too large to be handled by any single grammar on the Verbex
3000. Using two or more grammars will not solve the problem
since at any instant only one can be recognized and there is
no prior knowledge as to which is needed. The only possible
solution is a compromise with the requirements:

1. Eliminate the possibility of beginning a "FOR"
command without the "FOR <addressee>". This seems
only a slight inconvenience to the user.

2. Create a sentinel for graphics commands so that a
separate grammar can be called in upon recognition of
the sentinel. This was done using the sentinel word
"display". Hence the user must say "display" and
pause for a half-second before entering any graphics
commands. This is an inconvenience but in practice
the author had little difficulty in adapting to it.

3. A third possibility is to divorce "FOR" from
<addressee> so that the user is required to pause
after "FOR" while the application loads a grammar
containing all the Blue force-names. This was tried
and proved difficult as it violates the rule of
placing grammar boundaries where natural pauses
occur. "FOR KITTY" is much more natural than "FOR",
pause, "KITTY".

Another design question was how to handle the "control_f"
for executing predefined Blue launch plans. Should the user
say "control_f" or some more meaningful word such as
"execute"? The latter alternative was chosen as being
easier to associate with plans and not being confused with
the other control character, "control_k", used for
cancelling commands in mid-stream. This character is quite
prominent in use in NWISS, has a distinctive acoustic
pattern, and is shorter than some phrase such as "cancel
command". Hence "control_k" is used as it looks. In both
cases, the application code converts the recognized string
to the proper ASCII output character for NWISS.

Similarly, the application code can make the user's task
easier by not requiring "PRE" to be said after each plan
name. Thus the plans are specified as A6STRIKE, BCAP1, et
cetera and the application code takes the recognized plan
name and attaches ".PRE" to it as required for NWISS.

The requirements for the Nwisgram1 grammar have been refined
well enough that the grammar can be specified as in
Figure 2.5. As noted earlier, objects such as "1.1" cannot
be entered into GRID. The period is illegal except as the
first character of a compound object. Hence the application
code must convert "1point1" to "1.1" for output to NWISS.
Another application code conversion occurs with .aircraft
because the Verbex 3000 outputs a space between

Nwisgram1 ::= FOR .force
              DISPLAY
              EXECUTE .preplan
              DESIGNATE .status
              CONTROL_K
              BEARING TRACK
              BEARING FORCE
              FORCE
              TRACK
              POSITION

.status ::= FRIENDLY
            NEUTRAL
            ENEMY

.force ::= 1point1
           .aircraft
           .ship/base

.preplan ::= A6STRIKE
             BAIEASH
             BCAP1
             BCAP2
             F14STRCAP

.aircraft ::= MP6 0 .digit
              SH1 0 .digit
              VA0 .digit .digit
              VE0 0 .digit
              VS1 0 0
              VF0 .digit .digit
              VH0 0 .digit
              VK0 0 .digit
              VSC 0 .digit
              VI0 0 .digit
              VE0 0 .digit

.ship/base ::= KITTY
               KNOX
               1CSAN
               MCCOK
               MISAW
               ONAHA
               RATHE
               SPRUA
               WICHI
               KILSO

Figure 2.5 NWISS Nwisgram1 Grammar.

each object it recognizes. Hence "MP604" is placed in a
buffer as "MP6 0 4", where the program then removes the
offending spaces prior to output to NWISS. Too many
conversions such as "1.1", ".PRE" and aircraft callsigns add
to the length and complexity of the application code and
potentially can slow the real time capability of the Verbex
3000. Hence the effort should be made to place objects in
grammars exactly as they are to be output where possible.
The SPADS report for Nwisgram1 is:

Total vocabulary is 49 words
Vocabulary is 14% of capacity
Complexity is 48% of capacity.

Here, by comparison, the complexity is somewhat higher than
that (41%) for Position, due to the driving factor of total
number of words.
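
The three output conversions named above can be gathered into one small routine. The sketch below is in Python for brevity and is not the Appendix A code; the function name convert_for_output and the kind labels are hypothetical.

```python
def convert_for_output(kind, recognized):
    """Apply the output conversions described in the text.

    kind is a hypothetical label for the object that was recognized:
    "force", "aircraft" or "preplan". recognized is the string as the
    recognizer placed it in the buffer.
    """
    if kind == "force" and recognized == "1point1":
        return "1.1"                        # period cannot be entered in GRID
    if kind == "aircraft":
        return recognized.replace(" ", "")  # "MP6 0 4" becomes "MP604"
    if kind == "preplan":
        return recognized + ".PRE"          # the user does not say "PRE"
    return recognized                       # everything else passes unchanged
```

Each branch is cheap on its own; the concern raised in the text is the cumulative length of many such special cases in a program that must stay within the size limit and keep up with speech.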


b.  Force,  Track,  and  Position 


The last three objects of Nwisgram1's top level definition
are not just objects but also the names of three individual
grammars. Their appearance in Nwisgram1 is necessitated by
the syntax of the "BEARING" command as shown in Figure 2.3.
FORCE, TRACK and POSITION have been specified as individual
grammars for three reasons: they are used often as command
keywords; the size of their respective argument domains
precludes lumping them together or inside some other
grammar; and the argument can be made that there is a
natural pause after these keywords but before specification
of their arguments. Since Position has already been
examined (see Figure 2.4), the Force and Track grammars will
be described.


Force ::= 1point1
          .aircraft
          .ship/base
          .orngbases
          CONTROL_K

.orngbases ::= WONSA
               ALEKS
               PETRO
               VLAD

The Force grammar is quite similar to the compound object
.force in Nwisgram1. The only difference is the addition of
.orngbases because NWISS allows the Blue player to request
bearings on Orange bases by name as well as position for
convenience. Hence the redundancy between Nwisgram1 and
Force is considerable but necessary: one grammar cannot be
devised to do the job of both. This grammar overlap has a
cost in terms of storage space and user training time but is
unavoidable. The SPADS report for Force is:

Total vocabulary is 37 words
Vocabulary is 11% of capacity
Complexity is 58% of capacity.

The  number  of  initial  path  words  (27)  is  the  major  factor  in 
this  complexity  figure. 

The Track grammar is relatively small and simple but is
called quite often:

Track ::= BA0 .digit .digit
          BE0 .digit .digit
          BP0 .digit .digit
          BS0 .digit .digit
          BU0 .digit .digit
          CONTROL_K

The SPADS report for Track is:

Total vocabulary is 15 words
Vocabulary is 5% of capacity
Complexity is 24% of capacity.

To see how the application code comes into play, consider
the BEARING command: when the user says "BEARING FORCE", for
example, a complete path through Nwisgram1 exists and the
result is placed in a buffer. The application code analyzes
the buffer contents and determines that they should be sent
unchanged to NWISS with a space after "FORCE", then calls in
grammar Force for recognition, checks the contents again for
necessary conversions, outputs the converted string with a
trailing space, returns to Nwisgram1 to see what the next
keyword in the command will be (a choice of FORCE, TRACK, or
POSITION), outputs this keyword followed by a space, calls
in the keyword-specified grammar, performs any necessary
conversions and outputs the string followed by a <CR>.
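
The sequence just described can be sketched compactly. The fragment below is Python rather than the Pascal of Appendix A, and it is a simplification under stated assumptions: recognize, hostwrite and convert are stand-ins supplied by the caller, cancellation with CONTROL_K is not handled, and the two-pass latitude and longitude prompting used with Position is omitted.

```python
def run_bearing(recognize, hostwrite, convert):
    """Sequence the grammars for one BEARING command.

    recognize(grammar) returns the recognized string, hostwrite(text)
    sends text to the host, and convert(grammar, text) applies any
    output conversions. All three are caller-supplied stand-ins.
    """
    first = recognize("Nwisgram1")            # e.g. "BEARING FORCE"
    hostwrite(first + " ")
    grammar = first.split()[-1].capitalize()  # "FORCE" selects "Force"
    hostwrite(convert(grammar, recognize(grammar)) + " ")

    keyword = recognize("Nwisgram1")          # FORCE, TRACK or POSITION
    hostwrite(keyword + " ")
    grammar = keyword.capitalize()
    hostwrite(convert(grammar, recognize(grammar)) + "\r")
```

The point of the sketch is the alternation between the top-level grammar and the keyword-specified grammar, with output to the host after every recognition rather than once at the end.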

c.  Display 

As previously discussed, Nwisgram1 contains the sentinel
"DISPLAY" to allow the application control software

Display ::= RADIUS .digit+
            DROP TRACK
            CENTER FORCE
            CENTER POSITION
            PLACE .what .where
            CANCEL .what .where
            PLOT LOB ESM
            PLOT LOB SONAR
            ERASE LOB ESM
            ERASE LOB SONAR
            CONTROL_K

.what ::= CIRCLE
          GRID
          XMARK
          ALL

.where ::= FORCE
           TRACK
           POSITION

Figure 2.6 NWISS Display Grammar.

to call in the grammar with that name, shown in Figure 2.6.
The SPADS report on Display is:

Total vocabulary is 23 words
Vocabulary is 3% of capacity
Complexity is 49% of capacity.

Airorders ::= ALTITUDE .digit+
              BARRIER
              BINGO
              CONTROL_K
              COURSE .digit+
              COVER
              EMCON .plan
              FIRE
              MISSION .mission
              ORDERS
              PROCEED POSITION
              REFUEL VK00 .4digit
              SPEED .digit+
              TAKE
              WEAPONS FREE .how
              WEAPONS TIGHT

.mission ::= AEW
             AIRTANKER
             ASW
             CAP
             SEARCH
             STRCAP
             STRIKE
             SURCAP

.how ::= AIR
         ALL
         ENEMY .enemy
         SUBMARINE
         SURFACE

.plan ::= ASN
          AIRS
          RADIA
          SILEN
          SONAR
          SURF

.4digit ::= 0
            1
            2
            3

.enemy ::= AIR
           ALL
           SUBMARINE
           SURFACE

Figure 2.7 NWISS Airorders Grammar.


d.  Airorders  and  Shiporder 


These two grammars are the largest and most complex
developed for NWISS. The difference of singular vs. plural
in the names is due to GRID not accepting grammar names
longer than nine characters. As the names indicate, there
is sufficient difference between the types of orders that
aircraft and ships receive to justify individual grammars
for reasons of reduced complexity and increased accuracy.
One of these two grammars is called from Nwisgram1 every
time "FOR <addressee>" is recognized (both 1.1 and MISAW
call Shiporder). Figure 2.7 contains Airorders. Here is
the only instance of command arguments being excluded from a
grammar because of lack of use at NPS and need for
efficiency: there are actually 14 possible aircraft missions
in NWISS but only 8 have been included in Airorders.

Figure  2.8  contains  Shiporder.  Comparison  with 
Airorders  shows  that  the  two  grammars  have  nine  commands  in 
common.  SPEED  is  defined  differently  in  Shiporder  because 
ship  speeds  can  be  narrowly  defined.  If  ships  ever  go 
faster  than  39  knots,  the  grammar  must  be  changed.  The 
SPADS report for Shiporder is:

Total vocabulary is 42 words
Vocabulary is 12% of capacity
Complexity is 53% of capacity.
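
The 39-knot ceiling follows directly from the two SPEED alternatives in Shiporder: a single .digit, or a .3digit (1, 2 or 3) followed by a .digit. The Python sketch below, which is illustrative only and uses a regular expression as a stand-in for the grammar, enumerates the speeds that can be spoken.

```python
import re

# SPEED .digit            covers 0 through 9
# SPEED .3digit .digit    covers 10 through 39 (.3digit is 1, 2 or 3)
SHIP_SPEED = re.compile(r"[0-9]|[1-3][0-9]")


def speakable_ship_speeds():
    """Return every speed below 100 that the two alternatives admit."""
    return [n for n in range(100) if SHIP_SPEED.fullmatch(str(n))]
```

Narrowing the first digit in this way is the same device used for .6digit in the Position grammar: fewer legal paths at the cost of a hard upper bound.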

e.  Launch 

Because of its difficult and lengthy syntax (see
Figure 2.2), and a large vocabulary requirement due to the
large number of aircraft types which can be launched and the
large number of expendables which can be loaded on the
aircraft, the LAUNCH command merits its own grammar. The
Launch grammar can only be called from Shiporder when that

Shiporder ::= BLIP .on/off
              CONTROL_K
              COURSE .digit+
              DECM .on/off
              DEPTH .digit+
              EMCON .plan
              FIRE
              LAUNCH
              ORDERS
              PERISCOPE
              PROCEED POSITION
              RBOC .on/off
              SPEED .digit
              SPEED .3digit .digit
              STATION
              SURFACE
              TAKE
              WEAPONS FREE .how
              WEAPONS TIGHT

.on/off ::= ON
            OFF

.3digit ::= 1
            2
            3

.plan ::= AEa
          AIRS
          RADIA
          SILEN
          SONAR
          SURF

.how ::= AIR
         ALL
         ENEMY .enemy
         SUBMARINE
         SURFACE

.enemy ::= AIR
           ALL
           SUBMARINE
           SURFACE

Figure 2.8 NWISS Shiporder Grammar.

command is stated by the user. As noted earlier, only a
simplified version of the command syntax is supported in
this application and hence the grammar is smaller than might

Launch ::= .digit .acfttype .acfttype .digit
           .digit [.digit] .wpnsload
           CONTROL_K
           LOAD
           PROCEED POSITION
           STOP

.acfttype ::= A6E
              A7E
              E2C
              EA3B
              EA6B
              F14A
              F14T
              KA6D
              P3C
              S3A
              SH2F
              SH3H

.wpnsload ::= HARP
              MK46A
              MK82
              MK83
              MK84
              PHENX
              SHRIK
              SPAR
              S0538
              SSQ47
              SSQ53
              SSQ62
              SWDR
              WALLI

Figure 2.9 NWISS Launch Grammar.

otherwise be required. For instance, the <event-name>
argument can only be specified as the repeated aircraft type
plus a single digit. The choice of "FLT PLAN" commands is
limited to MISSION, LOAD, PROCEED POSITION, and STOP. Here
is an instance where the application program is used to save
grammar storage space and complexity: the MISSION command
and its eight arguments are not part of Launch. Instead the
application code calls in Airorders for recognition at that
point, provides a display message to the user (feedback and
guidance to users is described more fully in the next
chapter) asking for the mission, outputs accordingly and
returns to the Launch grammar for the next recognition. The
same is true of the course, speed and altitude arguments for
LAUNCH, wherein a "utility" grammar, Digits (exactly what
its name implies), is called in three times with a different
display prompt each time.
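
Reuse of the Digits utility grammar might look like the following sketch. It is written in Python rather than Pascal, the prompt strings are assumed rather than taken from Appendix A, and recognize and display are stand-ins supplied by the caller.

```python
def collect_launch_numbers(recognize, display):
    """Prompt for course, speed and altitude using one grammar, Digits.

    recognize(grammar) returns the recognized string with a space
    between objects; display(message) shows a prompt to the user.
    """
    values = {}
    for prompt in ("COURSE", "SPEED", "ALTITUDE"):
        display(prompt)                           # different prompt each time
        spoken = recognize("Digits")              # e.g. "2 7 0"
        values[prompt] = spoken.replace(" ", "")  # contiguous digit string
    return values
```

Because the prompt, not the grammar, tells the user which number is expected, one small grammar serves all three arguments.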

The  implication  above  is  that  grammar  design  and 
application  program  design  are  intertwined  so  that  consider¬ 
ation  cannot  be  given  one  without  the  other.  A  further 
implication  is  that  the  user  is  forced  to  follow  the  appli¬ 
cation  program  sequence  of  directions  in  entering  the  launch 
command.  From  one  viewpoint  this  simplified  syntax  and 
controlled sequence of input eases the user's job. From a
different point of view, it costs the user in terms of the
flexibility afforded by the keyboard. (Of course there is
always the option to increase grammar and program complexity
to provide such flexibility.) The SPADS report for Launch
is:

Total vocabulary is 41 words
Vocabulary is 12% of capacity
Complexity is 40% of capacity.

f.  Fire 

The FIRE (TORPEDO) command syntax appears in Figure 2.1. In
addition to torpedoes, NWISS players can FIRE cruise
missiles using the following syntax:

FIRE <number> <name> CRUISE (missile)
     AT <shore-base>
     BEARING <degrees> RANGE <nautical-miles>

"FOR SPRUA FIRE 3 HRPON CRUISE (missile) AT ALEKS"

The Fire grammar can be called from either Airorders or
Shiporder and appears in Figure 2.10. Although the FIRE
command is much less complex than LAUNCH, it still has a
large enough vocabulary requirement to merit a separate
grammar. As with LAUNCH, the FIRE (CRUISE) command is
broken into parts. However, the application code does not
branch to other grammars but simply recalls Fire until the
command is complete. The SPADS report for Fire is:

Total vocabulary is 26 words
Vocabulary is 8% of capacity
Complexity is 42% of capacity.


Fire ::= .digit .cruistype CRUISE
         .digit .torptype TORPEDO
         AT .base
         BEARING .digit+
         RANGE .digit+
         CONTROL_K

.cruistype ::= TOMHK
               HRPON

.base ::= ALEKS
          PETRO
          VLAD
          WONSA

.torptype ::= ASROC
              MK46
              MK46A
              MK48

Figure  2.10  NWISS  Fire  Grammar. 


C.  PASCAL  PROCEDURE  DESIGN 

Numerous  references  to  the  "application  code"  in 
preceding  sections  have  already  indicated  what  its  purposes 
are: 

•  to  provide  for  the  control  of  the  interactive  process 
between  the  user  and  the  Verbex  3000 

•  to  control  data  output  to  the  host  process,  NWISS. 


Hence calling the correct grammars in sequential order is
not enough: feedback must be provided to the user so that he
knows the machine's status at all times. In part this is
accomplished automatically by indicator lights on the Verbex
3000 User's Console. In part it is accomplished by the
system response of the host process to which the user is
inputting commands. Finally, with regard to the subject at
hand, it is also accomplished in part by the visual and
aural messages which the application program generates to
the user through the User's Console. (No aural feedback is
used in the NWISS application.)

Appendix A of this thesis contains the approximately 500
lines of Pascal code used to control the NWISS continuous
voice application. This section will describe the Verbex
predefined functions and types which appear repetitively
throughout Appendix A and explain the reasons underlying the
programming techniques used.

1. Verbex Predefined Functions and Types

It was noted in Chapter 1 that Verbex has created a library
of about 20 predefined functions to ease the programmer's
task. However, the SPADS is still in beta test status and
many of these functions do not work yet, though they are
defined. Of those that work, only three are used in the
NWISS application and are defined below:

1. Recognize (grammarname, buffername) is the workhorse
function. It tells the Verbex 3000 to begin
listening for acoustic signals matching the named
grammar and to place the output result in the named
buffer (of type "string"). All Verbex functions
return a value of type "short" to indicate success or
failure or some other appropriate result. Recognize
can return four values: 1) voicein means success;
2) recognize_bad means failure; 3) timeout means no
input which could be classified as recognizable or
unrecognizable was received for approximately one
minute; 4) keypadin means the recognition cycle was
interrupted by the user from the User's Console
keypad.

2. Hostwrite (buffername) tells the Verbex 3000 to write
the contents of the named buffer (string) to the host
computer (NWISS in this case) and simply returns
success or failure.

3. Displaymessageclear (display_primary, message) tells
the Verbex 3000 to write the message (of type string)
to the 32 character display on the User's Console and
simply returns success or failure.
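
A typical use of the four Recognize results can be sketched as a retry loop. The fragment below is Python, not Verbex Pascal: the Result enumeration, its numeric values, the retry limit and the message texts are all assumptions made for illustration, and recognize and display are stand-ins for the predefined functions.

```python
from enum import Enum


class Result(Enum):
    VOICEIN = 1        # success
    RECOGNIZE_BAD = 2  # failure
    TIMEOUT = 3        # no classifiable input for about one minute
    KEYPADIN = 4       # interrupted from the console keypad


def recognize_until_success(recognize, display, grammar, max_tries=3):
    """Call recognize(grammar) until it succeeds or the user interrupts.

    recognize returns a (Result, text) pair; display shows a short
    message. Returns the recognized text, or None if none was obtained.
    """
    for _ in range(max_tries):
        result, text = recognize(grammar)
        if result is Result.VOICEIN:
            return text
        if result is Result.KEYPADIN:
            return None
        display("SAY AGAIN" if result is Result.RECOGNIZE_BAD else "WAITING")
    return None
```

Keeping the message texts short matters in practice because the console display holds only 32 characters.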

Some other functions were investigated but found not to
work. This was unfortunate as it made the programming task
for NWISS a rather tedious one in terms of comparing and
manipulating character strings character by character.
Wordcount (string), Wordfind (string), Stringcopy (string1,
string2), and Stringcompare (string1, string2) are fairly
descriptive names of functions which are defined in [Ref. 5]
and are expected to work with the next SPADS software
release. Thus the Appendix A software will need a fair
amount of revamping in order to take advantage of these new
functions when they are available.

2. Programming Techniques

The job of the voice application programmer is to write a
Pascal procedure with the name "application". Only that
name will suffice as the procedure is imbedded by the SPADS
compiler into the standard operating software for the Verbex
3000, where it is called at the appropriate time. The NWISS
application procedure in Appendix A uses a high number of
labels (35) and a proportionate number of GOTO statements.
This is due in part to the fact that it was not possible to
define functions of the predefined type "string" such as
will be available from Verbex, and in part to the fact that
the GOTO statement is efficient and saves one from indenting
off the right side of the page in a highly nested
environment, which easily results when jumping from grammar
to grammar. The program is 25,000 bytes long and hence
close to the Verbex 3000 upper limit of 30,000 bytes.³ For
this reason and to allow room for growth, the comments have
been kept to a minimum but are intended to be adequate for
the purpose of future updates and maintenance.


³ SPADS Training Course


III. USER'S GUIDE


To use the NWISS continuous voice application properly, one
must invest approximately three hours both in learning the
operation of the Verbex 3000 voice terminal and in training
voice patterns to the grammars. This investment may be two
or three times more than that required to become proficient
at inputting NWISS commands from the keyboard (given that
one already has a fair amount of keyboard experience).
Nevertheless, it is the author's opinion that the investment
in voice input will more than pay for itself in time saved
and reduced aggravation when several lengthy NWISS sessions
are to be played. The reasons for this opinion have already
been stated in several places in this thesis and stem from
the inherent advantages of voice input: naturalness, speed,
hands-free and eyes-free (relative to a keyboard) input.
Further, it is difficult to output mistakes in the sense
that the NWISS grammars only have "correct" objects to be
output and substitution errors are exceedingly rare if a
person has taken the necessary time to train voice patterns
properly. This chapter will move sequentially through the
steps which a prospective user of the NWISS continuous voice
application should follow in becoming proficient.


A. LEARNING OPERATION OF THE VERBEX 3000

nwiss/nwiss

on the Verbex 3000 associated VT102 keyboard. This will
cause the NWISS application to be loaded.

B. ENROLLING THE NWISS VOCABULARY

The first step in training one's voice patterns on the Verbex 3000 is
to enroll the entire vocabulary (the NWISS vocabulary is 151 words)
at one time. This means that the Verbex 3000 will automatically step
the user through all 151 words, requesting each to be spoken twice,
occasionally three times, to get an initial set of voice patterns for
each word. This process should take only fifteen minutes.

In order to make the enrollment process go smoothly, the user should
take time beforehand to look at the vocabulary and decide what
pronunciation will be given each word. As many of the NWISS words are
really just symbols put together to make an aircraft type, weapon
type, track number, callsign, et cetera, it is important to do this
beforehand. See Table III for a complete NWISS vocabulary ordered
(column by column) the same way as the Verbex 3000 will present it.
Suggested rules of thumb for pronunciation are:

1. In general, use the most natural pronunciation which
comes to mind.

2. Pronounce numbers which appear as part of fixed
identifiers naturally, e.g. "F14A" as "F fourteen A" or
"MK48" as "mark forty-eight".

3. Do pronounce digits which appear as part of variable
strings, e.g. callsigns, track numbers, bearings, et
cetera, as individual digits, e.g. "ALTITUDE 2500" as
"ALTITUDE two five zero zero".

4. Pronounce -, N, E, W, and S as "dash", "north",
"east", "west" and "south".

5. Do not use the phonetic alphabet for individual
letter pronunciation. It is not natural or necessary.
For example, pronounce "VA024" as "V A zero two four",
not "victor alpha ...".

6. Give the full pronunciation to abbreviated ship
names, shore base names, and weapon names, e.g.
"Spruance", "Tomahawk", "Harpoon", "Sidewinder",
"Misawa". ("Kitty" is perfectly acceptable for
"Kittyhawk".)

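Rules 3 through 5 above can be sketched as a small lookup routine. This
is only an illustration: the function names, the Python language, and
the sample position string "35-20N" are assumptions made here and do
not come from the NWISS or Verbex software.

```python
# Hypothetical helpers illustrating pronunciation rules 3, 4 and 5.
# None of these names appear in the thesis software.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
POSITION_WORDS = {
    "-": "dash", "N": "north", "E": "east", "W": "west", "S": "south",
}


def speak_digits(field):
    """Rule 3: speak every digit of a variable field individually."""
    return " ".join(DIGIT_WORDS[ch] for ch in field)


def speak_position(field):
    """Rules 3 and 4: digits one at a time; -, N, E, W, S as words."""
    words = []
    for ch in field:
        if ch in DIGIT_WORDS:
            words.append(DIGIT_WORDS[ch])
        elif ch in POSITION_WORDS:
            words.append(POSITION_WORDS[ch])
        else:
            raise ValueError("unexpected position character: " + ch)
    return " ".join(words)


def speak_callsign(callsign):
    """Rule 5: plain letter names (no phonetic alphabet), single digits."""
    return " ".join(DIGIT_WORDS.get(ch, ch) for ch in callsign)


print("ALTITUDE " + speak_digits("2500"))
print(speak_callsign("VA024"))
print(speak_position("35-20N"))
```

Fixed identifiers such as "F14A" (rule 2) are deliberately left out,
since their natural pronunciation is chosen word by word by the user.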
C. TRAINING THE GRAMMARS

The next step after enrollment is to train the NWISS grammars. This
means that the Verbex 3000 will automatically step the user through a
large number of triplets (3 words in a phrase). This training can be
tedious, especially with the number of digits used in NWISS. However,
it is very important to accomplish this training properly to get good
recognition results. (After the enrollment phase, one could choose to
test his or her recognition accuracy on the Verbex 3000 and would
find scores ranging around 50 or 60. After the training phase,
testing is automatically invoked and should show recognition scores
in the 90's.)

To make this training less tedious, the following change has been
made to the Verbex 3000 scheme of training: the 32 character display
on the User's Console will ask whether it is desired to train

ALL GRAMMARS?

The only correct response is NO. It will then ask the user which
grammars to train individually. This is the desired mode. It allows
the user to take a break in between grammars for as long as desired.
With "allgrammars", the machine retains no memory of where one leaves
the ordered list of triplets and, in essence, a marathon training
session is implied. For this reason, and to conserve space, the NWISS
"allgrammars" is merely a shell with but a few entries to please
SPADS' expectations. Training the grammars individually should take
about 15 or 20 minutes each, depending on size. The Digits grammar is
last on the list and may be left untrained, as the user will get to
train many digits in the other grammars. However, if digits ever seem
a problem, then it may be worthwhile to train Digits as well as the
other grammars.

D.  TESTING 

After each grammar has been trained, the Verbex 3000 will ask the
user if testing is desired. This is a worthwhile two-minute exercise
in which the Verbex 3000 displays complete legal paths (not just
triplets) through the grammar and, after the user has spoken each,
displays the recognition score for that utterance. Scores should
generally be in the 90's with a few 80's. Scores in the 70's and
below may indicate retraining is needed.
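
The retraining rule of thumb can be written down as a simple check. The
sketch below assumes the user has noted the displayed scores by hand;
the function name, the cutoff of 80 (one reading of "the 70's and
below") and the sample scores are illustrative assumptions, not Verbex
output.

```python
# Hypothetical bookkeeping aid; the Verbex 3000 does not provide this.
RETRAIN_BELOW = 80  # assumption: "70's and below" means any score under 80


def grammars_to_retrain(scores_by_grammar):
    """Return the grammars with at least one score under the cutoff."""
    flagged = []
    for name, scores in scores_by_grammar.items():
        if any(score < RETRAIN_BELOW for score in scores):
            flagged.append(name)
    return flagged


# Made-up scores, for illustration only.
noted_scores = {"Nwisgram1": [95, 91, 88], "Digits": [92, 84, 74]}
print(grammars_to_retrain(noted_scores))
```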

However, complete paths through grammars are not complete NWISS
commands. Users should test their "feel" for grammar boundaries,
where pauses are required, by testing on NWISS itself (after training
all grammars and prior to beginning operation). Figure 3.1 contains a
fairly representative sample of NWISS commands which the user should
attempt. Pauses are indicated by "..." and prompts are inside
parentheses.

"EXECUTE F14STRCAP"


Figure 3.1 NWISS Test Commands


E.  OPERATIONAL  USE 

After the enrollment and training phases are over, the interesting
phase begins: actual input to NWISS. What follows are a few
suggestions to make this phase easier and hopefully trouble-free. In
general, one should enter the NWISS commands in accordance with the
display messages (which appear on the User's Console every time the
Verbex 3000 must leave one grammar and call in another) and with the
"feel" for those grammars obtained from training the triplets. Speech
is continuous within a grammar but discrete across grammar
boundaries. A few guidelines are:

1. A pause is always required after saying "FOR
<addressee>", "DISPLAY", "FORCE", "TRACK", or
"POSITION". (Wait for the appropriate display prompt
before continuing.)

2. When speaking a field of digits, prepare ahead of
time what they are and speak them continuously
without pausing in the middle. However, this is not
true of positions (latitude and longitude), aircraft
callsigns, or track numbers (i.e. FORCE, TRACK, and
POSITION), which are defined as fixed length fields
and may be entered as discretely or continuously as
desired.

3. Simple commands which have only one argument of
digits should be spoken continuously, e.g. "SPEED
35", "RADIUS 250", or "ALTITUDE 2000".

4. For the more difficult commands (i.e. FIRE, LAUNCH,
BARRIER, and STATION) the command keyword itself
serves as an entry point to other grammars and
application code, and hence a pause is required after it.

5. Under "LAUNCH", <event-name> must be specified as the
aircraft type plus a single digit, e.g. "F14A1" or
"E3C2".

6. Conversation with other players can easily trigger
recognition in the Verbex 3000 and cause unwanted
output to NWISS. This can be prevented by swinging
the headset microphone away and covering it with the
hand or pressing the "STOP" button on the User's
Console. This latter method is most effective, as it
stops the Verbex 3000 from listening and is easy to
clear: simply press the "YES" key in response to the
"CONTINUE?" message on the display.

7. Don't panic if the above happens. "CONTROL_K" can be
issued from anywhere and will return the process back
to Nwisgram1 ("ENTER NWISS COMMAND PLEASE" is
displayed).
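
The pause rules in guidelines 1 and 4 can be sketched as a routine that
marks where a speaker should stop and wait for the display prompt. It
is a hedged illustration only: the function name is hypothetical, and
the sample command strings are made up for the example rather than
taken from verified NWISS syntax.

```python
# Keywords after which guidelines 1 and 4 call for a pause.
PAUSE_AFTER = {
    "DISPLAY", "FORCE", "TRACK", "POSITION",   # guideline 1
    "FIRE", "LAUNCH", "BARRIER", "STATION",    # guideline 4
}


def mark_pauses(command):
    """Insert "..." wherever the guidelines require a pause."""
    tokens = command.split()
    marked = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        marked.append(token)
        if token == "FOR" and i + 1 < len(tokens):
            # The pause follows the addressee, not the word FOR itself.
            marked.append(tokens[i + 1])
            marked.append("...")
            i += 1
        elif token in PAUSE_AFTER:
            marked.append("...")
        i += 1
    # A pause marker at the very end of a command carries no information.
    if marked and marked[-1] == "...":
        marked.pop()
    return " ".join(marked)


print(mark_pauses("FOR KITTY LAUNCH F14A1"))
print(mark_pauses("SPEED 35"))
```

Single-argument commands such as "SPEED 35" pass through unchanged,
which matches guideline 3.
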

IV. CONCLUSIONS AND RECOMMENDATIONS


The scope of this thesis was confined to the analysis of NWISS
command syntax and the creation of continuous voice application
software to meet the requirements developed from that analysis. The
result is that NWISS users can input their commands using their most
natural mode of communication, speech. While the thesis objectives
are therefore satisfied in this practical sense, whether the higher
goal of achieving a truly effective improvement in man-machine
interaction has been achieved cannot be known until independent,
operational testing is conducted. At this time no one but the author
has trained their voice patterns on the Verbex 3000 and exercised the
NWISS continuous voice application.

Hence the principal recommendation is that the NWISS voice
application be used, tested, refined and improved to ensure that the
above goal is achieved. Only through use will the pitfalls be found
and corrected. The requirements, as analyzed here, will change over
time and be reinterpreted several times as well. New NWISS scenarios
are certain to be developed and require new force names, base names,
et cetera. The Orange side needs to have its continuous voice
application too; hopefully that will not be difficult to do in light
of this thesis. Thus, like any original software work, the NWISS
voice application will need to be modified, in all likelihood
extensively, and toward that end it has been defined and organized
through this thesis.

Some thought should perhaps be given to approaching the problem from
the other end: how should NWISS command syntax be changed to reflect
the needs of voice input? One suggestion is that NWISS not handle the
incoming data character by character, as from a keyboard, but instead
"buffer" and parse the entire command: the user can then "send" the
buffer if it is satisfactory or cancel it. Prompting can be
adequately provided by the Verbex 3000 in such a setup.
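
A minimal sketch of the buffering idea follows, assuming recognized
phrases arrive as strings and that a complete command is handed over
in one piece. The class and method names are hypothetical, and parsing
of the completed command is left out because it would belong to NWISS.

```python
class CommandBuffer:
    """Collects recognized phrases until the user sends or cancels."""

    def __init__(self, deliver):
        # deliver: a callable that receives the finished command string
        # (a stand-in for whatever would pass the command to NWISS).
        self._deliver = deliver
        self._words = []

    def add(self, phrase):
        """Append one recognized phrase to the pending command."""
        self._words.extend(phrase.split())

    def preview(self):
        """Return the pending command so it can be shown to the user."""
        return " ".join(self._words)

    def send(self):
        """Deliver the pending command in one piece, then clear it."""
        command = self.preview()
        self._words.clear()
        if command:
            self._deliver(command)
        return command

    def cancel(self):
        """Discard the pending command without delivering anything."""
        self._words.clear()


delivered = []
buffer = CommandBuffer(delivered.append)
buffer.add("SPEED")
buffer.add("35")
buffer.cancel()            # the user changes their mind
buffer.add("RADIUS 250")
buffer.send()
print(delivered)
```

Because nothing is delivered until send is called, a misrecognized
phrase can be discarded before it ever reaches the wargame.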

Some "requirements" have not been met. In particular, the ability to
specify the "TIME" of a command is not available. This is the result
of a design judgment that it would be too difficult to provide and is
seldom used. Another lack is the ability to get help (keyboard "?")
at any place in a command. The problem here is that to get the
desired effect, every grammar must have a multitude of legal paths
defined which end in "?". However, with careful study, one might be
able to redefine some of the NWISS grammars to allow "help" to be
spoken in the middle of a command where it might most be needed. Here
is a situation where perhaps NWISS modification, such as having a
separate "HELP" command whereby one would specify the command and/or
command argument where help is needed, might be easier to accomplish.

One or two additional grammars would be required on the Verbex 3000
but there is room for that. Creation of such grammars would also
facilitate implementing the "CANCEL" command of NWISS, which is not
implemented currently.

There will be some who criticize the grammar boundaries as either
being misplaced or just too "discrete". Either criticism could well
be valid: misplacements can be corrected to some extent, but the
length of pauses for grammar boundaries will be more difficult to
address. Only faster processors and faster, larger memories can solve
this problem. The Verbex 3000 represents the state of the art
(commercially) today in terms of affordable continuous voice
recognition technology.

Sometime in the foreseeable future men will talk naturally to
machines and machines will talk back in clear, understandable prose.
Obviously the NWISS continuous voice application is far removed from
that scenario, but it represents one of many steps which must be
taken. To the extent that computer-aided wargames such as NWISS
entail a high degree of man-machine interaction and model the real
world where military decision makers must make real time decisions in
consort with a computer, they serve the purpose of promoting
man-computer symbiosis. Hopefully this demonstration of continuous
voice recognition technology will form a foundation for further study
by others into its full possibilities in advancing the state of
man-machine interaction.